silu_and_mul fused moe #208

Open
wants to merge 7 commits into
base: main


Chi-Chu319

silu_and_mul fused moe

Migrated from ROCm/triton#710

Chi-Chu319 requested a review from rahulbatra85 on March 17, 2025, 10:22
@rahulbatra85
Contributor

@Chi-Chu319 Can you please re-trigger the CI?
Weird that it failed for PA and RMSNorm tests

# Calculate new pid based on the new grouping
# Note that we need to consider the following two cases:
# 1. the current pid is on a tall xcd
# 2. the current pid is on a short xcd


Hi, can I get an example of how a block is mapped onto the 8-die chip?

Author


It's the same as in https://github.com/ROCm/triton/blob/main_perf/python/perf-kernels/gemm.py#L110

Example of a kernel with 100 pids:
The pids are assigned to the XCDs in a round-robin fashion, so pid 0 goes to XCD 0, pid 1 goes to XCD 1, and so on, wrapping around after XCD 7 (pid 8 goes back to XCD 0).
In the end, XCDs 0, 1, 2, 3 get 13 pids each and XCDs 4, 5, 6, 7 get 12 pids each (4 * 13 + 4 * 12 = 100).

The remapping permutes the pid sequence so that

PID: [0, 1, 2, 3, 4, ..., 99]
      |  |  |  |  |        |
XCD: [0, 1, 2, 3, 4, ...,  3]

is mapped to

PID: [0, 13, 26, 39, 52, 64, 76, 88, 1, 14, 27, ..., 51]
      |   |   |   |   |   |   |   |  |   |   |        |
XCD: [0,  1,  2,  3,  4,  5,  6,  7, 0,  1,  2, ...,  3]

For example, before the remapping XCD 0 gets pids [0, 8, 16, ...] and XCD 1 gets [1, 9, 17, ...]. After the remapping XCD 0 gets [0, 1, 2, ...] and XCD 1 gets [13, 14, 15, ...], so each XCD only works on adjacent pids.
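
To make the example concrete, here is a minimal plain-Python sketch of the tall/short XCD calculation that the quoted code comment refers to. It is not copied from the PR: the function name `remap_xcd` and its parameters are hypothetical, and it assumes 8 XCDs with round-robin dispatch (`xcd = pid % 8`), as in the 100-pid example above.

```python
def remap_xcd(pid: int, grid_size: int, num_xcds: int = 8) -> int:
    """Return the remapped pid so that each XCD ends up with a contiguous pid range.

    Assumes the hardware dispatches pids round-robin: pid p runs on XCD p % num_xcds.
    """
    if num_xcds == 1:
        return pid
    # Largest number of pids any XCD receives (13 for grid_size=100).
    pids_per_xcd = (grid_size + num_xcds - 1) // num_xcds
    # "Tall" XCDs receive pids_per_xcd pids; "short" ones receive one fewer.
    tall_xcds = grid_size % num_xcds
    if tall_xcds == 0:
        tall_xcds = num_xcds
    xcd = pid % num_xcds          # XCD this pid was dispatched to
    local_pid = pid // num_xcds   # position of this pid within its XCD
    if xcd < tall_xcds:
        # Case 1: the current pid is on a tall XCD.
        return xcd * pids_per_xcd + local_pid
    # Case 2: the current pid is on a short XCD; skip all tall ranges first.
    return (tall_xcds * pids_per_xcd
            + (xcd - tall_xcds) * (pids_per_xcd - 1)
            + local_pid)


new_pids = [remap_xcd(p, 100) for p in range(100)]
print(new_pids[:11])  # [0, 13, 26, 39, 52, 64, 76, 88, 1, 14, 27]

# Remapped pids handled by XCD 1 (original pids 1, 9, 17, ...).
print([new_pids[p] for p in range(1, 100, 8)])  # [13, 14, ..., 25]
```

Under these assumptions the mapping is a permutation of 0..99: XCDs 0 to 3 own 13 consecutive pids each and XCDs 4 to 7 own 12 each, which is why the offset for a short XCD has to account for the tall ranges that precede it.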


5 participants